output at flexible time levels#217

Open
guoqing-noaa wants to merge 4 commits intoufs-community:gsl/developfrom
guoqing-noaa:flexible_timelevels4gsl

@guoqing-noaa
Collaborator

@guoqing-noaa guoqing-noaa commented Feb 24, 2026

Introduce a new output_timelevels attribute for MPAS streams that enables variable output intervals.

With this capability, we may output every 15 minutes in the first hour, every hour in the first 3 days, every 3 hours for the next 4 days, and every 6 hours in the last 3 days.

We can also use this to only write out forecast files during a given period, such as: output_timelevels="6-12h"

Check the PR description below for details on how to specify the time levels.
Here is a quick example: output_timelevels="0-3-15m 4-72 75-168-3 174-240-6"

FYI, with this PR, we may write mpasout files at "0 1 2 3 6 9 12" and history files at "4 5 7 8 11 13-99999". This avoids writing both mpasout and history files, which are almost identical, at the same time levels.

This PR solves issue #214

Priority Reviewers

…ttribute

Introduce a new `output_timelevels` attribute for MPAS streams that enables variable output intervals.
With this capability, we may output every 15 minutes in the first hour, every hour in the first 3 days, every 3 hours for the next 4 days, and every 6 hours in the last 3 days.
We can also use this to only write out forecast files during a given period, such as: output_timelevels="6-12h"

Check the PR description for details on how to specify the time levels.
Here is a quick example: output_timelevels="0-3-15m 4-72 75-168-3 174-240-6"
@guoqing-noaa
Collaborator Author

output_timelevels Specification

The forecast output times are defined by a space-separated list of sections:

<section> [section] [section] ...

Each section expands into one or more forecast times.
The final output is the union of all expanded times.
Users need to make sure the times are in ascending order.

1. Time String Format

A time_string represents a forecast time (offset from the initialization time).
It may be written in one of the following forms:

1.1 Integer Forms (Hours)

An integer with no suffix represents hours:

6     → 6 hours
12    → 12 hours
0     → 0 hours

1.2 Integer With Unit Descriptors

A duration may be written using unit suffixes:

h  → hours
m  → minutes
s  → seconds
D  → days

Examples:

1h30m   → 1 hour 30 minutes
45m     → 45 minutes
90s     → 90 seconds
6h15m   → 6 hours 15 minutes
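
The time_string rules in sections 1.1 and 1.2 can be sketched in a few lines of Python. This is only an illustration of the rules as written in this description, not the Fortran code in the PR; the function name `time_string_to_seconds` and the choice of seconds as the internal unit are assumptions made for the example.

```python
import re

# Seconds per unit suffix, following the table above (h, m, s, D).
UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1, "D": 86400}

# One duration part: an integer immediately followed by a unit suffix.
_PART = re.compile(r"([0-9]+)([hmsD])")


def time_string_to_seconds(text: str) -> int:
    """Convert a time_string such as '6', '45m', or '1h30m' to seconds."""
    # A bare integer with no suffix is interpreted as hours.
    if re.fullmatch(r"[0-9]+", text):
        return int(text) * 3600

    total = 0
    pos = 0
    for match in _PART.finditer(text):
        # Parts must be contiguous; anything skipped over is invalid input.
        if match.start() != pos:
            raise ValueError(f"invalid time string: {text!r}")
        total += int(match.group(1)) * UNIT_SECONDS[match.group(2)]
        pos = match.end()

    # Reject empty input and trailing characters that are not a valid part.
    if pos == 0 or pos != len(text):
        raise ValueError(f"invalid time string: {text!r}")
    return total


if __name__ == "__main__":
    for example in ["6", "1h30m", "45m", "90s", "6h15m"]:
        print(example, "->", time_string_to_seconds(example), "seconds")
```

Raising an error on malformed input is a design choice of this sketch; the description above does not say how the PR reports invalid strings.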

2. Sequence Expansion (Range Expression)

A section may define a regularly spaced sequence using:
start-stop-step
This generates values beginning at start, incremented by step, and ending at stop (inclusive if exactly reached).
Examples:

0-1h-15m

expands to:

0 15m 30m 45m 1h

If -step is omitted, a default step of 1 hour is used:
7-12 means 7-12-1 and expands to 7 8 9 10 11 12

Note: Integer and unit-based forms may be freely mixed, for example 2-6h-2.
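
As a rough illustration of the start-stop[-step] expansion described above, the sketch below expands a whole specification into a list of offsets in seconds. It is a Python approximation of the stated rules, not the PR's implementation; the helper names (`to_seconds`, `expand_section`, `expand_timelevels`) are hypothetical, and the result is returned in the order given, since the description leaves ascending order to the user.

```python
import re

UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1, "D": 86400}


def to_seconds(text: str) -> int:
    """Convert one time_string (bare hours or unit-suffixed parts) to seconds."""
    if re.fullmatch(r"[0-9]+", text):
        return int(text) * 3600  # no suffix means hours
    parts = re.findall(r"([0-9]+)([hmsD])", text)
    # The parts must reproduce the input exactly, otherwise it is malformed.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"invalid time string: {text!r}")
    return sum(int(num) * UNIT_SECONDS[unit] for num, unit in parts)


def expand_section(section: str) -> list:
    """Expand 'time', 'start-stop', or 'start-stop-step' into seconds."""
    fields = section.split("-")
    if len(fields) == 1:
        return [to_seconds(fields[0])]
    if len(fields) > 3:
        raise ValueError(f"too many '-' separated fields: {section!r}")
    start, stop = to_seconds(fields[0]), to_seconds(fields[1])
    # The default step is 1 hour when -step is omitted.
    step = to_seconds(fields[2]) if len(fields) == 3 else 3600
    if step <= 0:
        raise ValueError(f"step must be positive: {section!r}")
    # stop is included only if it is reached exactly by the stepping.
    return list(range(start, stop + 1, step))


def expand_timelevels(spec: str) -> list:
    """Expand a space-separated specification, keeping the order given."""
    times = []
    for section in spec.split():  # split() tolerates repeated spaces
        times.extend(expand_section(section))
    return times


if __name__ == "__main__":
    print(expand_timelevels("0-1h-15m"))
    print(len(expand_timelevels("0-3-15m 4-72 75-168-3 174-240-6")))
```

Using integer seconds keeps the stepping exact, so a stop value such as 1h is hit precisely by a 15m step rather than missed through rounding.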

3. The EBNF (Extended Backus-Naur Form) style grammar:

specification   = section , { SP , section } ;

section         = range
                | time_string ;

range           = time_string , "-" , time_string , [ "-" , time_string ] ;
                  (* start-stop[-step] *)

time_string     = integer
                | duration ;

duration        = duration_part , { duration_part } ;

duration_part   = integer , unit ;

unit            = "h" | "m" | "s" | "D" ;

integer         = digit , { digit } ;

digit           = "0" | "1" | "2" | "3" | "4"
                | "5" | "6" | "7" | "8" | "9" ;

SP              = " " , { " " } ;
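
One way to check candidate strings against this grammar is to transcribe it into a regular expression, since the grammar has no recursion. The sketch below does that in Python; the names are hypothetical and it follows the grammar exactly as written above, so it accepts only integers (no decimal values) and no leading or trailing spaces.

```python
import re

# Direct transcription of the EBNF rules above into regex fragments.
INTEGER = r"[0-9]+"
DURATION = rf"(?:{INTEGER}[hmsD])+"            # duration_part , { duration_part }
TIME_STRING = rf"(?:{DURATION}|{INTEGER})"     # integer | duration
RANGE = rf"{TIME_STRING}-{TIME_STRING}(?:-{TIME_STRING})?"  # start-stop[-step]
SECTION = rf"(?:{RANGE}|{TIME_STRING})"
SPECIFICATION = re.compile(rf"{SECTION}(?: +{SECTION})*")   # sections joined by SP


def matches_grammar(spec: str) -> bool:
    """Return True if spec is derivable from the grammar as written."""
    return SPECIFICATION.fullmatch(spec) is not None


if __name__ == "__main__":
    for candidate in ["0-3-15m 4-72 75-168-3 174-240-6", "1.5h", "0  1     "]:
        print(repr(candidate), matches_grammar(candidate))
```

Taken literally, the grammar does not cover surrounding whitespace, so an input with trailing spaces only matches after it is trimmed (for example with `spec.strip()`).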

@guoqing-noaa
Collaborator Author

guoqing-noaa commented Feb 24, 2026

The following situations have been tested:
Note: extra spaces were intentionally added to test whether this PR can handle them correctly.

output_timelevels="0  1     "
output_timelevels="0h  1h   "

output_timelevels="0-1-15m 2-6-2"
output_timelevels="2-6-2"
output_timelevels="2-6-2   8-999"
output_timelevels="2-6h-2   8-999h"

output_timelevels="0m 15m  1h    1h3m"
output_timelevels="0s 30s 15m  1h    1h3m"
output_timelevels="0m 30s 15m  1h    1h3m"

@guoqing-noaa guoqing-noaa changed the title from "Enable variable output intervals via the output_timelevels stream attribute" to "output at flexible time levels" Feb 24, 2026
@guoqing-noaa
Collaborator Author

guoqing-noaa commented Feb 24, 2026

For a conus12km 12-hour forecast, I tried different settings for the da_state stream (no changes to the history and diag outputs):

the output_interval="1:00:00" run took 606s
the output_timelevels="0-999" run took 609s
The 3s overhead is more than offset once mpasout output is skipped at hours 4-12:
  the output_timelevels="0-3" run took 549s
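
For reference, the arithmetic behind these three timings can be written out as a small Python sketch. It uses only the wall-clock numbers quoted in this comment; the variable names are illustrative, and a single run per setting says nothing about run-to-run variability.

```python
# Wall-clock times in seconds, as reported above for the conus12km test.
baseline = 606      # output_interval="1:00:00"
all_levels = 609    # output_timelevels="0-999"
first_three = 549   # output_timelevels="0-3"

# Extra cost of the new attribute when it produces the same hourly output.
overhead = all_levels - baseline

# Time saved relative to the original hourly run when hours 4-12 are skipped.
saving = baseline - first_three

overhead_pct = 100 * overhead / baseline
saving_pct = 100 * saving / baseline

print(f"overhead: {overhead}s ({overhead_pct:.1f}% of the baseline run)")
print(f"saving:   {saving}s ({saving_pct:.1f}% of the baseline run)")
```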

Collaborator

@SamuelTrahanNOAA SamuelTrahanNOAA left a comment


These code changes are overcomplicated and will be difficult to extend or maintain. Any significant changes to the functionality will require a complicated rewrite.

More specific overview comments:

  1. Most of the parser code is a reimplementation of the Fortran standard split function. It would be better to call split.
  2. Much of the time-related code reimplements the MPAS and ESMF alarm functionality. It would be better to extend the alarm functionality.
  3. Parser errors aren't reported. If you type something wrong, you don't know what it was.
  4. There are out-of-bounds array reads due to off-by-one errors.
  5. The new syntax deviates from the MPAS days_HH:MM:SS time specification.
  6. Unlike the rest of the MPAS I/O interface, this uses a cryptic string instead of XML tags. If you used XML tags, the parser code wouldn't be necessary at all.

@SamuelTrahanNOAA
Collaborator

The EBNF (Extended Backus-Naur Form) style grammar

If you want to use a parser generator, the original grammar specification should be in the repository, not the automatically-generated code.

@guoqing-noaa
Collaborator Author

The EBNF (Extended Backus-Naur Form) style grammar

If you want to use a parser generator, the original grammar specification should be in the repository, not the automatically-generated code.

No, I don't intend to use a parser generator. I put EBNF there to make sure the overall timelevel specification logic is consistent and complete, and there are no hidden surprises.

@guoqing-noaa
Collaborator Author

guoqing-noaa commented Feb 24, 2026

These code changes are overcomplicated and will be difficult to extend or maintain. Any significant changes to the functionality will require a complicated rewrite.

More specific overview comments:

  1. Most of the parser code is a reimplementation of the Fortran standard split function. It would be better to call split.
  2. Much of the time-related code reimplements the MPAS and ESMF alarm functionality. It would be better to extend the alarm functionality.
  3. Parser errors aren't reported. If you type something wrong, you don't know what it was.
  4. There are out-of-bounds array reads due to off-by-one errors.
  5. The new syntax deviates from the MPAS days_HH:MM:SS time specification.
  6. Unlike the rest of the MPAS I/O interface, this uses a cryptic string instead of XML tags. If you used XML tags, the parser code wouldn't be necessary at all.

@SamuelTrahanNOAA Thanks for the comments and discussions!

This is the first version, meant to get things working. I also feel the implementation is somewhat complicated, and I would like to descope it to meet only our current needs. My thought is to limit the timelevel specification to minutes and hours only (and possibly require output to start at 0h).

I intentionally avoided the HH:MM:SS time format used in output_interval.
start-stop[-step] is the most appropriate format after lots of considerations. It is more user friendly. start-stop is used in lots of industry applications (I just added the [-step] part)

@SamuelTrahanNOAA
Collaborator

I put EBNF there to make sure the overall timelevel specification logic is consistent and complete, and there are no hidden surprises.

It is, indeed, a finely-polished EBNF with perfect indentation. However, it doesn't describe the code. Your read statement reads a real, not an integer.

@SamuelTrahanNOAA
Collaborator

start-stop[-step] is the most appropriate format after lots of considerations. It is more user friendly. start-stop is used in lots of industry applications (I just added the [-step] part)

What is more user-friendly is splitting them into individual XML tags so it is clear what is going on. That would use the existing MPAS parsing code (eliminating the parser). Also, it'll be less confusing to experienced MPAS users.

<output_interval start="01:15:00" stop="02:30:00" step="00:15:00"/> <!-- every 15 minutes from 1:15 to 2:30 -->
<output_interval start="06:00:00"/> <!-- Only at hour 6 -->

I can see how your string would be easier to pass through the rrfs-workflow bash scripts, but bash limitations shouldn't take precedence over MPAS code consistency and quality.
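
To make the XML alternative concrete, here is a hedged Python sketch of how tags like the two above could be expanded into output times. The `start`, `stop`, and `step` attributes are the proposal from this comment and are not existing MPAS syntax; the `<stream>` wrapper, the function names, and the HH:MM:SS-only time handling (no days_ prefix) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def hms_to_seconds(text: str) -> int:
    """Convert 'HH:MM:SS' to seconds (the days_HH:MM:SS form is not handled)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def expand_intervals(xml_text: str) -> list:
    """Expand hypothetical <output_interval start stop step/> tags to seconds."""
    root = ET.fromstring(xml_text)
    times = []
    for elem in root.iter("output_interval"):
        start = hms_to_seconds(elem.attrib["start"])
        stop_text = elem.get("stop")
        if stop_text is None:
            # Only start given: a single output time.
            times.append(start)
            continue
        step_text = elem.get("step")
        if step_text is None:
            raise ValueError("a tag with stop= also needs step= in this sketch")
        step = hms_to_seconds(step_text)
        if step <= 0:
            raise ValueError("step must be positive")
        # stop is included when the stepping lands on it exactly.
        times.extend(range(start, hms_to_seconds(stop_text) + 1, step))
    return times


# Hypothetical wrapper element around the two tags shown above.
SAMPLE = """
<stream>
  <output_interval start="01:15:00" stop="02:30:00" step="00:15:00"/>
  <output_interval start="06:00:00"/>
</stream>
"""

if __name__ == "__main__":
    print(expand_intervals(SAMPLE))
```

The point of the sketch is that the XML library handles tokenizing, so the only custom logic left is time conversion and range expansion.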

@SamuelTrahanNOAA
Collaborator

Can you please provide:

  1. Documentation of what part of the code was AI-generated. (Via code comments, perhaps.)
  2. Documentation of what part of the PR description and comments were AI-generated.
  3. Explanation of how it was generated.
  4. License restrictions the AI company placed on its generated code, description, and comments.

@guoqing-noaa
Collaborator Author

@clark-evans I am sorry, this may not be ready for an official review yet. I should have put this in draft mode from the beginning.

@guoqing-noaa guoqing-noaa marked this pull request as draft February 24, 2026 21:05
@guoqing-noaa
Collaborator Author

<output_interval start="01:15:00" stop="02:30:00" step="00:15:00"/> <!-- every 15 minutes from 1:15 to 2:30 -->
<output_interval start="06:00:00"/> <!-- Only at hour 6 -->

This looks good. Does current MPAS code support this?

@SamuelTrahanNOAA
Collaborator

This looks good. Does current MPAS code support this [xml alternative in Sam's prior comment]?

No. I'm suggesting it as an alternative implementation to your recursive descent parser.

@guoqing-noaa
Collaborator Author

@SamuelTrahanNOAA Thanks for the great discussion! For the moment, this PR is mainly for test/demo purposes.
I appreciate your specific comments so far, but we may pause further review. This will NOT be the final implementation if we move forward. General discussion may continue if you are interested and have time to think more about this.

I think my initial need is much simpler than the goal of this PR. We want to output mpasout files only at a limited set of time levels (such as 1h and 2h, plus 0h if needed).
My thinking went much further than needed, mainly driven by my recent related work on the rrfs-workflow side.

For our current needs, I think the changes by Andy Stokely (NCAR) may work. I will test those first. I was not aware of Andy's changes until 2 days ago.

@SamuelTrahanNOAA
Collaborator

WRF was limited to start...step...end for each stream. We were able to work around that by having multiple output streams. (Recall that the operational models HRRR, HWRF, NAM, and RAP were all WRF-based, along with other quasi-operational models.)

@dustinswales
Collaborator

@guoqing-noaa @SamuelTrahanNOAA
FWIW. To control output frequency for MPAS in the UFS we are using output_fh for native MPAS output, which could be configured to handle the need here.

For example, output_fh = 0.2 0.4 0.6 0.8 1 2 3 6 9 12 would provide output every 12 min for the first hour, every hour through hour 3, and every three hours for the remainder of the simulation.
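
The spacing implied by such a list can be checked with a short Python sketch that converts the fractional forecast hours to minutes and takes consecutive differences. The function name is hypothetical and this is not how the UFS reads output_fh; it only illustrates the arithmetic for the list quoted above.

```python
def output_gaps_minutes(output_fh):
    """Return the gaps, in whole minutes, between consecutive output hours."""
    # Round to whole minutes so that values like 0.2 h map cleanly to 12 min.
    minutes = [round(hours * 60) for hours in output_fh]
    return [later - earlier for earlier, later in zip(minutes, minutes[1:])]


# The list from the example above, as forecast hours.
OUTPUT_FH = [0.2, 0.4, 0.6, 0.8, 1, 2, 3, 6, 9, 12]

if __name__ == "__main__":
    print(output_gaps_minutes(OUTPUT_FH))
```

For this list the gaps are four of 12 minutes, two of 60 minutes, and three of 180 minutes.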

@guoqing-noaa
Collaborator Author

@guoqing-noaa @SamuelTrahanNOAA FWIW. To control output frequency for MPAS in the UFS we are using output_fh for native MPAS output, which could be configured to handle the need here.

For example, output_fh = 0.2 0.4 0.6 0.8 1 2 3 6 9 12 would provide output every 12 min for the first hour, every hour through hour 3, and every three hours for the remainder of the simulation.

@dustinswales It is good to know about this capability in UFS.
A few questions:

  1. MPAS will compute the output interval automatically and use the latest value if no more new intervals and so we don't need to specify all time levels to the end of the forecast, right?
    But how output_fh handles the situation "every 15 minutes in the first hour, every hour in the first 3 days (to 72h), every 3 hours for the next 4 days (up to 168hours), and every 6 hours in the last 3 days (up to 240h)"?

  2. UFS output will only consider weather simulation and not climate simulation which may cover many years?

  3. UFS output skips all the src/framework part and does not use streams, right?

Thanks!

@dustinswales
Collaborator

  1. MPAS will compute the output interval automatically and use the latest value if no more new intervals and so we don't need to specify all time levels to the end of the forecast, right?
    But how output_fh handles the situation "every 15 minutes in the first hour, every hour in the first 3 days (to 72h), every 3 hours for the next 4 days (up to 168hours), and every 6 hours in the last 3 days (up to 240h)"?

I believe output_fh would just need to be really long in this case: output_fh: 0.25 0.50 0.75 1 2 3 4 5 .... 72 75 78 .... 168 174 180 ... 240. I'm sure we could find a way to distill this information if it becomes a pain.

  2. UFS output will only consider weather simulation and not climate simulation which may cover many years?

This is mostly true.
Folks have run S2S experiments, ~months-to-year, but I expect that their output was on some regular interval that was straightforward to configure.

  3. UFS output skips all the src/framework part and does not use streams, right?

This is going to be the biggest change for Standalone users when moving to inline MPAS.

We use the MPAS framework for native input/output, but we do not use the MPAS stream manager as is done in MPAS standalone.
Instead, we use the pieces of the MPAS stream manager that we need, like MPAS_readstream and MPAS_writestream, to read/write data to/from the MPAS streams. These streams are defined within the MPAS subdriver, in addition to the Registry.xml file.

@SamuelTrahanNOAA
Collaborator

FYI: NOAA GSL doesn't do climate or seasonal research. Hence, the capability to output on multi-week timescales won't affect our work. It is a good question to ask, though, since others in NOAA do seasonal forecasts. I'd expect forecasts out to 90 days for some applications, just not ours. (I just wanted to clarify things since the word "climate" showed up in discussion.)

@guoqing-noaa
Collaborator Author

I believe output_fh would just need to be really long in this case: output_fh: 0.25 0.50 0.75 1 2 3 4 5 .... 72 75 78 .... 168 174 180 ... 240. I'm sure we could find a way to distill this information if it becomes a pain.

@dustinswales I think it is totally fine to enumerate all output hours inside the model. However, it would be very beneficial for users to specify this through a simplified interface, for example, output_fh="0-3-0.25 4-72-1 75-168-3 174-240-6".
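
A compact string like the one suggested here could be expanded into the full enumerated list with something like the Python sketch below. The compact syntax for output_fh is only the proposal made in this comment, not an existing UFS feature, and the function name is hypothetical. Decimal arithmetic is used so that fractional steps such as 0.25 accumulate exactly.

```python
from decimal import Decimal


def expand_output_fh(spec: str) -> list:
    """Expand 'start-stop[-step]' sections (in hours) into a list of hours."""
    hours = []
    for section in spec.split():
        fields = [Decimal(field) for field in section.split("-")]
        if len(fields) == 1:
            hours.append(fields[0])
            continue
        if len(fields) == 2:
            start, stop = fields
            step = Decimal(1)  # assumed default of 1 hour, as in output_timelevels
        elif len(fields) == 3:
            start, stop, step = fields
        else:
            raise ValueError(f"too many fields in section: {section!r}")
        if step <= 0:
            raise ValueError(f"step must be positive: {section!r}")
        value = start
        while value <= stop:  # stop is included when reached exactly
            hours.append(value)
            value += step
    # Convert to float at the end for convenient printing and comparison.
    return [float(hour) for hour in hours]


if __name__ == "__main__":
    expanded = expand_output_fh("0-3-0.25 4-72-1 75-168-3 174-240-6")
    print(len(expanded), expanded[:6], expanded[-3:])
```

With this split, the model side could keep working with a plain enumerated list while users write only the compact form.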

- parse the output_timelevels string only once at the beginning and store the results into MPAS_timelevel_spec_type for future use
- adopt the mpas_split_string(...) function to do string parsing (Fortran 2023's split is NOT used as it is very new and may not be widely supported by compilers)
@guoqing-noaa guoqing-noaa marked this pull request as ready for review March 16, 2026 03:46
@guoqing-noaa
Collaborator Author

guoqing-noaa commented Mar 16, 2026

This PR is ready for review now.
Based on Sam's feedback, I have made the following changes:

  1. parse the output_timelevels string only once at the beginning and store the results into MPAS_timelevel_spec_type for future use
  2. adopt the mpas_split_string(...) function to do string parsing

(Fortran 2023's split is NOT used here as it is very new and may not be widely supported by compilers)
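
The "parse once, then reuse" pattern from item 1 can be illustrated with a small Python sketch. It is not the Fortran MPAS_timelevel_spec_type from this PR; the class `TimelevelSpec`, its fields, and the hours-only syntax are simplifications assumed for the example.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class TimelevelSpec:
    """Hypothetical stand-in for a parsed specification, stored as seconds."""

    seconds: tuple

    @classmethod
    def parse(cls, spec: str) -> "TimelevelSpec":
        """Parse an hours-only 'start-stop[-step]' string a single time."""
        times = []
        for section in spec.split():
            fields = [int(field) for field in section.split("-")]
            if len(fields) == 1:
                times.append(fields[0] * 3600)
            elif len(fields) in (2, 3):
                step = fields[2] if len(fields) == 3 else 1
                times.extend(
                    hour * 3600 for hour in range(fields[0], fields[1] + 1, step)
                )
            else:
                raise ValueError(f"too many fields in section: {section!r}")
        # Store a sorted, de-duplicated tuple so later lookups can use bisection.
        return cls(tuple(sorted(set(times))))

    def is_output_time(self, elapsed_seconds: int) -> bool:
        """Check one model time against the stored list without re-parsing."""
        index = bisect_left(self.seconds, elapsed_seconds)
        return index < len(self.seconds) and self.seconds[index] == elapsed_seconds


if __name__ == "__main__":
    spec = TimelevelSpec.parse("0-3 6-12-3")  # parsed once, at start-up
    for hour in range(0, 13):
        print(hour, spec.is_output_time(hour * 3600))
```

Each time step then costs only a binary search over the stored values instead of a fresh string parse.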

All the previous tests listed in post#3 worked as expected.

We plan to use this PR for the upcoming Spring Forecast Experiment. Thanks!

3 participants